32 research outputs found

    Revisiting Data Complexity Metrics Based on Morphology for Overlap and Imbalance: Snapshot, New Overlap Number of Balls Metrics and Singular Problems Prospect

    Data Science and Machine Learning have become fundamental assets for companies and research institutions alike. As one of their fields, supervised classification predicts the class of new samples by learning from given training data. However, some properties can make datasets problematic to classify. To evaluate a dataset a priori, data complexity metrics have been used extensively. They provide information about different intrinsic characteristics of the data, which helps to assess classifier compatibility and to choose a course of action that improves performance. However, most complexity metrics focus on just one characteristic of the data, which can be insufficient to properly evaluate the dataset with respect to classifier performance. In fact, class overlap, a feature that is very detrimental to classification (especially when imbalance among class labels is also present), is hard to assess. This research work revisits complexity metrics based on data morphology. Given their nature, the premise is that they provide both good estimates of class overlap and strong correlations with classification performance. For that purpose, a novel family of metrics has been developed. Being based on ball coverage by classes, they are named Overlap Number of Balls. Finally, some prospects for adapting this family of metrics to singular (more complex) problems are discussed.
    Comment: 23 pages, 9 figures, preprint

    mldr.resampling: Efficient Reference Implementations of Multilabel Resampling Algorithms

    Resampling algorithms are a useful approach to imbalanced learning in multilabel scenarios. These methods have to handle singularities in multilabel data, such as the occurrence of frequent and infrequent labels in the same instance. Implementations of these methods are sometimes limited to the pseudocode provided by their authors in a paper. This Original Software Publication presents mldr.resampling, a software package that provides reference implementations of eleven multilabel resampling methods, with an emphasis on efficiency, since these algorithms are usually time-consuming.

    Explotación de la potencia de procesamiento mediante paralelismo: un recorrido histórico hasta la GPGPU (Exploiting processing power through parallelism: a historical overview up to GPGPU)

    Improvements in semiconductor manufacturing, with integration scales growing for decades, have contributed to a spectacular increase in the power of computing devices in their various forms: desktop and laptop computers, phones, tablets, consoles, and so on. That evolution has also met obstacles along the way which, among other things, brought the race for higher clock frequencies to an end several years ago. Today the power of a processor is no longer measured exclusively in GHz; other factors, such as the number of cores and their internal design, also have a large impact. This paper provides a historical review of how parallelism has been adapted over time to the hardware available at each moment, with the aim of exploiting it as fully as possible.
    Universidad de Granada: Departamento de Arquitectura y Tecnología de Computadores; Vicerrectorado para la Garantía de la Calidad

    Evolución tecnológica del hardware de vídeo y las GPU en los ordenadores personales (Technological evolution of video hardware and GPUs in personal computers)

    This article reviews the most important milestones in the evolution of graphics hardware. Communication between computers and people has advanced over time, reaching interactivity with the emergence of time-sharing systems in the early 1960s. Personal computers, whose expansion began almost two decades later, adopted on-screen display of information as the main means of communicating with the user from the very beginning. The hardware in charge of this task has gradually evolved into an indispensable part of the computer architecture, to such an extent that a large share of laptop and desktop computers now incorporate the graphics hardware into the same integrated circuit that houses the microprocessor.
    Universidad de Granada: Departamento de Arquitectura y Tecnología de Computadores; Vicerrectorado para la Garantía de la Calidad